Abstract

We show that expander codes, when properly instantiated, are high-rate list recoverable codes with 
linear-time list recovery algorithms. List recoverable codes have been useful recently in constructing 
efficiently list-decodable codes, as well as explicit constructions of matrices for compressive sensing and 
group testing. Previous list recoverable codes with linear-time decoding algorithms have all had rate at 
most 1/2; in contrast, our codes can have rate $1 - \varepsilon$ for any $\varepsilon > 0$. We can plug our high-rate codes
into a construction of Meir (2014) to obtain linear-time list recoverable codes of arbitrary rates R, which 
approach the optimal trade-off between the number of non-trivial lists provided and the rate of the code. 

While list-recovery is interesting on its own, our primary motivation is applications to list-decoding. 
A slight strengthening of our result would imply linear-time and optimally list-decodable codes for all
rates. Thus, our result is a step in the direction of solving this important problem. 


1 Introduction 

In the theory of error correcting codes, one seeks a code $C \subseteq F^n$ so that it is possible to recover any codeword
$c \in C$ given a corrupted version of that codeword. The most standard model of corruption is from errors:
some constant fraction of the symbols of a codeword might be adversarially changed. Another model of
corruption is that there is some uncertainty: in each position $i \in [n]$, there is some small list $S_i \subseteq F$ of
possible symbols. In this model of corruption, we cannot hope to recover $c$ exactly; indeed, suppose that
$S_i = \{c_i, c'_i\}$ for some codewords $c, c' \in C$. However, we can hope to recover a short list of codewords that
contains $c$. Such a guarantee is called list recoverability.

While this model is interesting on its own—there are several settings in which this sort of uncertainty 
may arise—one of our main motivations for studying list-recovery is list-decoding. We elaborate on this more 
in Section 1.1 below. 

We study the list recoverability of expander codes. These codes, introduced by Sipser and Spielman
in [SS96], are formed from an expander graph and an inner code $C_0$. One way to think about expander
codes is that they preserve some property of $C_0$, but have some additional useful structure. For example,
[SS96] showed that if $C_0$ has good distance, then so does the expander code; the additional structure of
the expander allows for a linear-time decoding algorithm. In [HOW14], it was shown that if $C_0$ has some
good (but not great) locality properties, then the larger expander code is a good locally correctable code.
In this work, we extend this list of useful properties to include list recoverability. We show that if $C_0$ is a
list recoverable code, then the resulting expander code is again list recoverable, but with a linear-time list
recovery algorithm.

1.1 List recovery 

List recoverable codes were first studied in the context of list-decoding and soft-decoding: a list recovery 
algorithm is at the heart of the celebrated Guruswami-Sudan list-decoder for Reed-Solomon codes [GS99] 
and for related codes [GR08]. Guruswami and Indyk showed how to use list recoverable codes to obtain
good list- and uniquely-decodable codes [GI02, GI03, GI04]. More recently, list recoverable codes have been
studied as interesting objects in their own right, and have found several algorithmic applications, in areas 
such as compressed sensing and group testing [NPR12, INR10, GNP+13]. 

We consider list recovery from erasures, which was also studied in [Gur03, GI04]. That is, some fraction 
of symbols may have no information; equivalently, $S_i = F$ for a constant fraction of $i \in [n]$. Another, stronger
guarantee is list recovery from errors. That is, $c_i \notin S_i$ for a constant fraction of $i \in [n]$. We do not consider
this stronger guarantee here, and it is an interesting question to extend our results for erasures to errors.
It should be noted that the problem of list recovery is interesting even when there are neither errors nor
erasures. In that case, the problem is: given $S_i \subseteq F$, find all the codewords $c \in C$ so that $c_i \in S_i$ for all $i$.
There are two parameters of interest. First, the rate $R := \log_q(|C|)/n$ of the code: ideally, we would like the
rate to be close to 1. Second, the efficiency of the recovery algorithm: ideally, we would be able to perform 
list-recovery in time linear in n. We survey the relevant results on list recoverable codes in Figure 1. While 
there are several known constructions of list recoverable codes with high rate, and there are several known 
constructions of list recoverable codes with linear-time decoders, there are no known prior constructions of 
codes which achieve both at once. 

In this work, we obtain the best of both worlds, and give constructions of high-rate, linear-time list 
recoverable codes. Additionally, our codes have constant (independent of n) list size and alphabet size. As 
mentioned above, our codes are actually expander codes—in particular, they retain the many nice properties 
of expander codes: they are explicit linear codes which are efficiently (uniquely) decodable from a constant 
fraction of errors. 

We can use these codes, along with a construction of Meir [Mei14], to obtain linear-time list recoverable
codes of any rate $R$, which obtain the optimal trade-off between the fraction $1 - \alpha$ of erasures and the rate
$R$. More precisely, for any $R \in [0,1]$, $\ell \in \mathbb{N}$, and $\eta > 0$, there is some $L = L(\eta, \ell)$ so that we can construct
rate $R$ codes which are $(R + \eta, \ell, L)$-list recoverable in linear time. The fact that our codes from the previous
paragraph have rate approaching 1 is necessary for this construction. To the best of our knowledge,
linear-time list-decodable codes obtaining this trade-off were also not known.

It is worth noting that if our construction worked for list recovery from errors, rather than erasures, then 
the reduction above would obtain linear-time list decodable codes, of rate $R$ and tolerating $1 - R - \eta$ errors.
(In fact, it would yield codes that are list-recoverable from errors, which is a strictly stronger notion). So far, 
all efficiently list-decodable codes in this regime have polynomial-time decoding algorithms. In this sense, 
our work is a step in the direction of linear-time optimal list decoding, which is an important open problem 
in coding theory. 1 

1.2 Expander codes 

Our list recoverable codes are actually properly instantiated expander codes. Expander codes are formed 
from a $d$-regular expander graph, and an inner code $C_0$ of length $d$, and are notable for their extremely fast
decoding algorithms. We give the details of the construction below in Section 2. The idea of using a graph 
to create an error correcting code was first used by Gallager [Gal63], and the addition of an inner code was 
suggested by Tanner [Tan81]. Sipser and Spielman introduced the use of an expander graph in [SS96]. There 
have been several improvements over the years by Barg and Zémor [Zem01, BZ02, BZ05, BZ06].

Recently, Hemenway, Ostrovsky and Wootters [HOW14] showed that expander codes can also be locally 
corrected, matching the best-known constructions in the high-rate, high-query regime for locally-correctable 
codes. That work showed that as long as the inner code exhibits suitable locality, then the overall expander 
code does as well. This raised a question: what other properties of the inner code does an expander code
preserve? In this work, we show that as long as the inner code is list recoverable (even without an efficient
algorithm), then the expander code itself is list recoverable, but with an extremely fast decoding algorithm.

1 In fact, adapting our construction to handle errors, even if we allow polynomial-time decoding, is interesting. First,
it would give a new family of efficiently-decodable, optimally list-decodable codes, very different from the existing algebraic
constructions. Secondly, there are no known uniformly constructive explicit codes (that is, constructible in time $\mathrm{poly}(n) \cdot C_\eta$)
with both constant list-size and constant alphabet size; adapting our construction to handle errors, even with polynomial-time
recovery, could resolve this.

It should be noted that the works of Guruswami and Indyk cited above on linear-time list recovery are 
also based on expander graphs. However, that construction is different from the expander codes of Sipser 
and Spielman. In particular, it does not seem that the Guruswami-Indyk construction can achieve a high 
rate while maintaining list recoverability. 

1.3 Our contributions 

We summarize our contributions below: 

1. The first construction of linear-time list-recoverable codes with rate approaching 1. As
shown in Figure 1, existing constructions have either low rate or substantially super-linear recovery
time. The fact that our codes have rate approaching 1 allows us to plug them into a construction
of [Mei14], to achieve the next bullet point:

2. The first construction of linear-time list-recoverable codes with optimal rate/erasure 
trade-off. We will show in Section 3.2 that our high-rate codes can be used to construct list-recoverable 
codes of arbitrary rates $R$, where we are given information about only an $R + \varepsilon$ fraction of the symbols.
As shown in Figure 1, existing constructions which achieve this trade-off have substantially super-linear 
recovery time. 

3. A step towards linear-time, optimally list decodable codes. Our results above are for
list-recovery from erasures. While this has been studied before [GI04], it is a weaker model than a standard
model which considers errors. As mentioned above, a solution in this more difficult model would lead 
to algorithmic improvements in list decoding (as well as potentially in compressed sensing, group 
testing, and related areas). It is our hope that understanding the erasure model will lead to a better 
understanding of the error model, and that our results will lead to improved list decodable codes. 

4. New tricks for expander codes. One take-away of our work is that expander codes are extremely 
flexible. This gives a third example (after unique- and local-decoding) of the expander-code construction
taking an inner code with some property and making that property efficiently exploitable. We
think that this take-away is an important observation, worthy of its own bullet point. It is a very 
interesting question what other properties this may work for. 


2 Definitions and Notation 

We begin by setting notation and defining list recovery. An error correcting code is $(\alpha, \ell, L)$-list recoverable
(from errors) if given lists of $\ell$ possible symbols at every index, there are at most $L$ codewords whose symbols
lie in an $\alpha$ fraction of the lists. We will use a slightly different definition of list recoverability, matching the
definition of [GI04]: to distinguish it from the definition above, we will call it list recoverability from erasures.

Definition 1 (List recoverability from erasures). An error correcting code $C \subseteq F_q^n$ is $(\alpha, \ell, L)$-list recoverable
from erasures if the following holds. Fix any sets $S_1, \ldots, S_n$ with $S_i \subseteq F_q$, so that $|S_i| \le \ell$ for at least $\alpha n$ of
the $i$'s and $S_i = F_q$ for all remaining $i$. Then there are at most $L$ codewords $c \in C$ so that $c \in S_1 \times S_2 \times \cdots \times S_n$.
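To make Definition 1 concrete, here is a minimal brute-force sketch in Python. It is only an illustration for tiny codes and is not the recovery algorithm of this paper; the function names and the convention that `None` marks an erased position ($S_i = F_q$) are assumptions made for the example.

```python
from itertools import combinations, product

def list_recover_from_erasures(codewords, lists):
    """Return all codewords c with c[i] in lists[i] at every non-erased position.

    lists[i] is None marks an erasure, i.e. S_i = F_q (no information).
    """
    return [c for c in codewords
            if all(S is None or c[i] in S for i, S in enumerate(lists))]

def is_list_recoverable(codewords, q, alpha, ell, L):
    """Exhaustively check Definition 1 for a tiny code over {0, ..., q-1}."""
    n = len(codewords[0])
    # Each position is either erased (None) or carries a nonempty list of size <= ell.
    options = [None] + [frozenset(s) for size in range(1, ell + 1)
                        for s in combinations(range(q), size)]
    for lists in product(options, repeat=n):
        informative = sum(S is not None for S in lists)
        # Only inputs with at least alpha * n informative positions are constrained.
        if informative >= alpha * n - 1e-9:
            if len(list_recover_from_erasures(codewords, lists)) > L:
                return False
    return True

# Binary repetition code of length 3 (hypothetical toy example).
rep = [(0, 0, 0), (1, 1, 1)]
print(is_list_recoverable(rep, q=2, alpha=1/3, ell=1, L=1))
```

The exhaustive check is exponential in $n$, which is exactly why the linear-time guarantee studied in this paper matters.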

In our study of list recoverability, it will be helpful to study the list cover of a list $S \subseteq F_q^n$:

Definition 2 (List cover). For a list $S \subseteq F_q^n$, the list cover of $S$ is

$LC(S) = \left( \{ c_i : c \in S \} \right)_{i=1}^{n}$.

The list cover size is $\max_{i \in [n]} |LC(S)_i|$.
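As a small illustration of Definition 2, the following Python sketch computes the list cover and its size for a list of codewords given as tuples. The function names are chosen for this example only.

```python
def list_cover(S):
    """LC(S): for each position i, the set of symbols {c[i] : c in S}."""
    n = len(S[0])
    return [{c[i] for c in S} for i in range(n)]

def list_cover_size(S):
    """The largest number of distinct symbols seen at any single position."""
    return max(len(symbols) for symbols in list_cover(S))

S = [(0, 1, 2), (0, 2, 2), (1, 1, 2)]
print(list_cover(S), list_cover_size(S))
```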
Source | Rate | List size L | Alphabet size | Agreement $\alpha$ | Recovery time | Explicit (E) / Linear (L)
[...] | $1 - \gamma$ | $O(\ell/\gamma)$ | $\ell^{O(1/\gamma)}$ | $1 - O(\gamma)$ | |
[...] | [...] | [...] | $\ell^{O(1/\gamma)}$ | $1 - O(\gamma)$ | |
Random linear code [Gur04] | $1 - \gamma$ | $\ell^{O(\ell/\gamma^2)}$ | $\ell^{O(1/\gamma)}$ | $1 - O(\gamma)$ | | L
Folded Reed-Solomon codes [GR08] | [...] | $n^{O(\log(\ell)/\gamma)}$ | $n^{O(\log(\ell)/\gamma^2)}$ | $1 - O(\gamma)$ | $n^{O(\log(\ell)/\gamma^2)}$ | EL
Folded RS subcodes: evaluation points in an explicit subspace-evasive set [DL12] | $1 - \gamma$ | $(1/\gamma)^{O(\ell/\gamma)}$ | $n^{O(\ell/\gamma^2)}$ | $1 - O(\gamma)$ | $n^{O(\ell/\gamma^2)}$ | E
[...] evaluation points in a [...] set [Gur11] | $1 - \gamma$ | [...] | $n^{O(\ell/\gamma^2)}$ | $1 - O(\gamma)$ | $n^{O(\ell/\gamma^2)}$ |
(Folded) AG subcodes [GX12, GX13] | $1 - \gamma$ | $O(\ell/\gamma)$ | $\exp(O(\ell/\gamma^2))$ | $1 - O(\gamma)$ | [...] $n^{O(1)}$ |
[GI03] | [...] | [...] | [...] | [...] | $O(n)$ | E
[GI04] | $\ell^{-O(1)}$ | $\ell$ | [...] | .999 (*) | $O(n)$ | E
This work | $1 - \gamma$ | [...] | $\ell^{O(1/\gamma)}$ | $1 - O(\gamma^3)$ (*) | $O(n)$ | EL

Figure 1: Results on high-rate list recoverable codes and on linear-time decodable list recoverable codes. Above, $n$ is
the block length of the $(\alpha, \ell, L)$-list recoverable code, and $\gamma > 0$ is sufficiently small and independent of $n$. Agreement
rates marked (*) are for erasures, and all others are from errors. An empty "recovery time" field means that there
are no known efficient algorithms. We remark that [GX13], along with the explicit subspace designs of [GK13], also
give explicit constructions of high-rate AG subcodes with polynomial time list-recovery and somewhat complicated
parameters; the list-size $L$ becomes super-constant.

The results listed above of [GR08, Gur11, DL12, GX12, GX13] also apply for any rate $R$ and agreement
$R + \gamma$. In Section 3.2, we show how to achieve the same trade-off (for erasures) in linear time using our codes.
Our construction will be based on expander graphs. We say a $d$-regular graph $H$ is a spectral expander
with parameter $\lambda$, if $\lambda$ is the second-largest eigenvalue of the normalized adjacency matrix of $H$. Intuitively,
the smaller $\lambda$ is, the better connected $H$ is; see [HLW06] for a survey of expanders and their applications.
We will take $H$ to be a Ramanujan graph, that is $\lambda \le 2\sqrt{d-1}/d$; explicit constructions of Ramanujan graphs
are known for arbitrarily large values of $d$ [LPS88, Mar88, Mor94]. For a graph $H$ with vertices $V(H)$ and
edges $E(H)$, we use the following notation. For a set $S \subseteq V(H)$, we use $\Gamma(S)$ to denote the neighborhood

$\Gamma(S) = \{v : \exists u \in S, (u, v) \in E(H)\}$.

For a set of edges $F \subseteq E(H)$, we use $\Gamma_F(S)$ to denote the neighborhood restricted to $F$:

$\Gamma_F(S) = \{v : \exists u \in S, (u, v) \in F\}$.

Given a $d$-regular $H$ and an inner code $C_0$, we define the Tanner code $C(H, C_0)$ as follows.

Definition 3 (Tanner code [Tan81]). If $H$ is a $d$-regular graph on $n$ vertices and $C_0$ is a linear code of block
length $d$, then the Tanner code created from $C_0$ and $H$ is the linear code $C \subseteq F_q^{E(H)}$, where each edge of $H$ is
assigned a symbol in $F_q$ and the edges adjacent to each vertex form a codeword in $C_0$:

$C = \{c \in F_q^{E(H)} : \forall v \in V(H),\ c|_{\Gamma(v)} \in C_0\}$.

Because codewords in $C_0$ are ordered collections of symbols whereas edges adjacent to a vertex in $H$ may
be unordered, creating a Tanner code requires choosing an ordering of the edges at each vertex of the graph. 
Although different orderings lead to different codes, our results (like all previous results on Tanner codes) 
work for all orderings. As our constructions work with any ordering of the edges adjacent to each vertex, we 
assume that some arbitrary ordering has been assigned, and do not discuss it further. 
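The following Python sketch illustrates Definition 3 on a toy instance by brute-force enumeration. The choice of graph (the complete graph $K_4$, which is 3-regular but far too small to be a meaningful expander), the inner code (the binary even-weight code of length 3), and the function name `tanner_code` are all assumptions made for illustration.

```python
from itertools import product

def tanner_code(edges, incident, inner_code, q=2):
    """Enumerate C(H, C_0): assignments of symbols to edges such that, at every
    vertex, the symbols on its incident edges (in the fixed order given by
    incident[v]) form a codeword of the inner code."""
    codewords = []
    for c in product(range(q), repeat=len(edges)):
        if all(tuple(c[e] for e in incident[v]) in inner_code for v in incident):
            codewords.append(c)
    return codewords

# Toy instance: H = K_4, with an arbitrary fixed edge ordering at each vertex.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
incident = {v: [i for i, e in enumerate(edges) if v in e] for v in range(4)}
# Inner code C_0: all even-weight binary words of length 3 (rate 2/3).
inner = {w for w in product(range(2), repeat=3) if sum(w) % 2 == 0}

code = tanner_code(edges, incident, inner)
print(len(code))
```

In this toy case the Tanner code is the cycle space of $K_4$, so it has dimension $6 - 4 + 1 = 3$ and rate $1/2$, which is consistent with the general lower bound $2R_0 - 1 = 1/3$ on the rate.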

When the underlying graph $H$ is an expander graph, 2 we call the resulting Tanner code an expander
code. Sipser and Spielman showed that expander codes are efficiently uniquely decodable from about a $\delta_0^2$
fraction of errors. We will only need unique decoding from erasures; the same bound of $\delta_0^2$ obviously holds
for erasures as well, but for completeness we state the following lemma, which we prove in Appendix A.

Lemma 1. If $C_0$ is a linear code of block length $d$ that can recover from $\delta_0 d$ erasures, and $H$
is a $d$-regular expander with normalized second eigenvalue $\lambda$, then the expander code $C$ can be recovered from
a $\delta_0/k$ fraction of erasures in linear time whenever $\lambda < \delta_0 - 2/k$.

Throughout this work, $C_0 \subseteq F_q^d$ will be $(\alpha_0, \ell, L)$-list recoverable from erasures, and the distance of $C_0$ is
$\delta_0$. We choose $H$ to be a Ramanujan graph, and $C = C(H, C_0)$ will be the expander code formed from $H$ and
$C_0$.

3 Results and constructions 

In this section, we give an overview of our constructions and state our results. Our main result (Theorem 2) is 
that list recoverable inner codes imply list recoverable expander codes. We then instantiate this construction 
to obtain the high-rate list recoverable codes claimed in Figure 1. Next, in Theorem 5 we show how to 
combine our codes with a construction of Meir [Mei14] to obtain linear-time list recoverable codes which
approach the optimal trade-off between $\alpha$ and $R$.

3.1 High-rate linear-time list recoverable codes 

Our main theorem is as follows. 

2 Although many expander codes rely on bipartite expander graphs (e.g. [Zem01]), we find it notationally simpler to use the
non-bipartite version. 
Theorem 2. Suppose that $C_0$ is $(\alpha_0, \ell, L)$-list recoverable from erasures, of rate $R_0$, length $d$, and distance
$\delta_0$, and suppose that $H$ is a $d$-regular expander graph with normalized second eigenvalue $\lambda$. If

$\lambda < \frac{\delta_0^2}{12 \ell^L}$,

then the expander code $C$ formed from $C_0$ and $H$ has rate at least $2R_0 - 1$ and is $(\alpha, \ell, L')$-list recoverable
from erasures, where

$L' \le \exp_\ell\left( [\ldots] \right)$

and $\alpha$ satisfies

$1 - \alpha \ge (1 - \alpha_0) \left( [\ldots] \right)$.

Further, the running time of the list recovery algorithm is $O_{L, \ell, \delta_0, d}(n)$.

Above, the notation $\exp_\ell(\cdot)$ means $\ell^{(\cdot)}$. Before we prove Theorem 2 and give the recovery algorithm, we
show how to instantiate these codes to give the parameters claimed in Figure 1. We will use a random linear
code as the inner code. The following theorem about the list recoverability of random linear codes follows
from a union bound argument (see Guruswami's thesis [Gur04]).


Theorem 3 ([Gur04]). For any $q \ge 2$, for all $1 \le \ell \le q$, and for all $L > \ell$, and for all $\alpha_0 \in (0, 1]$, a random
linear code of rate $R_0$ is $(\alpha_0, \ell, L)$-list recoverable, with high probability, as long as

$R_0 \ge \frac{1}{\lg(q)} \left( \alpha_0 \lg(q/\ell) - H(\alpha_0) - H(\ell/q) \frac{q}{\log_\ell(L+1)} \right) - o(1)$.

For any $\gamma > 0$, and any (small constant) $\zeta > 0$, choose

$q = \exp_\ell\left( [\ldots] \right)$ and $[\ldots]$ and $\alpha_0 = 1 - \gamma(1 - 3\zeta)$.    (1)


Then Theorem 3 asserts that with high probability, a random linear code of rate $R_0 = 1 - \gamma$ is $(\alpha_0, \ell, L)$-list
recoverable. Additionally, with high probability a random linear code with the parameters above will have
distance $\delta_0 = \gamma(1 + O(\gamma))$. By the union bound there exists an inner code $C_0$ with both the above distance
and the above list recoverability.

Plugging all this into Theorem 2, we get explicit codes of rate $1 - 2\gamma$ which are $(\alpha, \ell, L')$-list recoverable
in linear time, for

$L' = \exp_\ell\left( \gamma^{-4} \exp_\ell\left( \exp_\ell\left( C\ell/\gamma^2 \right) \right) \right)$

for some constant $C = C(\zeta)$, and

$\alpha = 1 - \frac{1 - 3\zeta}{[\ldots]} \gamma^3$.

This recovers the parameters claimed in Figure 1. Above, we can choose

$d = O\left( \frac{\ell^{2L}}{\gamma^4} \right)$

so that the Ramanujan graph would have parameter $\lambda$ obeying the conditions of Theorem 2. Thus, when $\ell, \gamma$
are constant, so is the degree $d$, and the running time of the recovery algorithm is linear in $n$, and thus in the
block length $nd$ of the expander code. Our construction uses an inner code with distance $\delta_0 = \gamma(1 + O(\gamma))$.
It is known that if the inner code in an expander graph has distance $\delta_0$, the expander code has distance at
least $\Omega(\delta_0^2)$ (see for example Lemma 1). Thus the distance of our construction is $\delta = \Omega(\gamma^2)$.
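As a rough numerical illustration of the degree bound, the sketch below searches for the smallest $d$ whose Ramanujan bound $2\sqrt{d-1}/d$ falls below a target value of $\lambda$. Two assumptions are made here: the threshold in Theorem 2 is read as $\delta_0^2/(12\ell^L)$, and the search ignores the question of which degrees actually admit explicit Ramanujan graphs. The function names and sample parameters are hypothetical.

```python
import math

def ramanujan_lambda(d):
    """Normalized second-eigenvalue bound 2*sqrt(d-1)/d of a d-regular Ramanujan graph."""
    return 2 * math.sqrt(d - 1) / d

def min_degree(lam_max):
    """Smallest d >= 3 with 2*sqrt(d-1)/d < lam_max (the bound decreases in d for d > 2)."""
    d = 3
    while ramanujan_lambda(d) >= lam_max:
        d += 1
    return d

# Hypothetical sample parameters; the threshold formula is an assumption (see above).
delta0, ell, L = 0.5, 2, 2
lam_max = delta0 ** 2 / (12 * ell ** L)
d = min_degree(lam_max)
print(d, ramanujan_lambda(d))
```

Even for these very small parameters the required degree is in the hundreds of thousands, which reflects the $\ell^{2L}/\gamma^4$ scaling: the degree is constant in $n$, but not small.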
Remark 1. Both the alphabet size and the list size $L'$ are constant, if $\ell$ and $\gamma$ are constant. However, $L'$
depends rather badly on $\ell$, even compared to the other high-rate constructions in Figure 1. This is because the
bound (1) is likely not tight; it would be interesting to either improve this bound or to give an inner code with
better list size $L$. The key restrictions for such an inner code are that (a) the rate of the code must be close
to 1; (b) the list size $L$ must be constant, and (c) the code must be linear. Notice that (b) and (c) prevent
the use of either Folded Reed-Solomon codes or their restriction to a subspace evasive set, respectively.

3.2 List recoverable codes approaching capacity 

We can use our list recoverable codes, along with a construction of Meir [Mei14], to construct codes which
approach the optimal trade-off between the rate $R$ and the agreement $\alpha$. To quantify this, we state the
following analog of the list-decoding capacity theorem.

Theorem 4 (List recovery capacity theorem). For every $R > 0$, and $L \ge \ell$, there is some code $C$ of rate $R$
over $F_q$ which is $(R + \eta(\ell, L), \ell, L)$-list recoverable from erasures, for any

$\eta(\ell, L) \ge \frac{4\ell}{L}$ and $q \ge \ell^{2/\eta}$.

Further, for any constants $\eta, R > 0$, any integer $\ell$, any code of rate $R$ which is $(R - \eta, \ell, L)$-list recoverable
from erasures must have $L = q^{\Omega(n)}$.

The proof is given in Appendix B. Although Theorem 4 ensures the existence of certain list-recoverable 
codes, the proof of Theorem 4 is probabilistic, and does not provide a means of efficiently identifying (or 
decoding) these codes. Using the approach of [Mei14] we can turn our construction of linear-time list
recoverable codes into list recoverable codes approaching capacity. 

Theorem 5. For any $R > 0$, $\ell > 0$, and for all sufficiently small $\eta > 0$, there is some $L$, depending only
on $\ell$ and $\eta$, and some constant $d$, depending only on $\eta$, so that whenever $q \ge \ell^{6/\eta}$ there is a family of
$(\alpha, \ell, L)$-list recoverable codes $C \subseteq F_{q^d}^n$ with rate at least $R$, for

$\alpha = R + \eta$.

Further, these codes can be list-recovered in linear time.

We follow the approach of [Mei14], which adapts a construction of [AL96] to take advantage of high-rate
codes with a desirable property. Informally, the takeaway of [Mei14] is that, given a family of codes with
any nice property and rate approaching 1, one can make a family of codes with the same nice property that 
achieves the Singleton bound. For completeness, we describe the approach below, and give a self-contained 
proof. 

Proof of Theorem 5. Fix $R$, $\ell$, and $\eta$. Let $\alpha = R + \eta$ as above, and suppose $q \ge \ell^{2/\eta}$. Let $R_0 = \alpha - 2\eta/3 = R + \eta/3$
and $R_1 = 1 - \eta/3$. We construct the code $C$ from three ingredients: an "outer" code $C_1$ that is a high-rate
list recoverable code with efficient decoding, a bipartite expander, and a short "inner" code that is list
recoverable. More specifically, the construction relies on:

1. A high-rate outer code $C_1$. Concretely, $C_1$ will be our expander-based list recoverable codes guaranteed
by Theorem 2 in Section 3.1. The code $C_1 \subseteq F_q^m$ will be of rate $R_1 = 1 - \eta/3$, and $(\alpha_1, \ell_1, L_1)$-list
recoverable from erasures, where $\alpha_1 = 1 - O(\eta^3)$ and $L_1 = L_1(\eta, \ell_1)$ depends only on $\eta, \ell_1$. The distance
of this code is $\delta_1 = \Omega(\eta^2)$. Note that the block-length, $m$, is specified by the choice of $R_1$ and $\ell_1$.

2. A bipartite expander graph $G = (U, V, E)$ on $2 \cdot m/(R_0 d) =: 2n$ vertices, with degree $d$, which has the
following property: for at least $\alpha_1 n$ of the vertices in $U$,

$|\Gamma(u) \cap A| \ge (|A|/n - \eta/3) d$,

for any set $A \subseteq V$. Such a graph exists with degree $d$ that depends only on $\alpha_1$ and $\eta$, and hence only
on $\eta$.
3. A code $C_0 \subseteq F_q^d$ of rate $R_0$, which is $(\alpha - \eta/3, \ell, \ell_1)$-list recoverable, where

$R_0 = \alpha - \frac{2\eta}{3}, \qquad \ell_1 = \frac{12\ell}{\eta}$,

as in the first part of Theorem 4. Although the codes guaranteed by Theorem 4 do not come with
decoding algorithms, we will choose $d$ to be a constant, so the code $C_0$ can be list recovered in constant
time by a brute-force recovery algorithm.

Figure 2: The construction of [Mei14]. (1) Encode $x$ with $C_1$. (2) Bundle symbols of $C_1(x)$ into groups of size $R_0 d$.
(3) Encode each bundle with $C_0$. (4) Redistribute according to the $n \times n$ bipartite graph $G$. If $(u, v) \in E(G)$, and
$\Gamma_i(u) = v$ and $\Gamma_j(v) = u$ then we define $c_j^{(v)} = z_i^{(u)}$.


We remark that several ingredients of this construction share notation with ingredients of the construction in
the previous section (the degree $d$, code $C_0$, etc.), although they are different. Because this section is entirely
self-contained, we chose to overload notation to avoid excessive sub/super-scripting and hope that this does
not create confusion. The only properties of the code $C_1$ from the previous section we use are those that are
listed in Item 1.

The success of this construction relies on the fact that the code $C_1$ can have rate approaching 1 (specifically,
rate $1 - \eta/3$). The efficiency of the decoding algorithm comes from the efficiency of decoding $C_1$: since $C_1$
has linear time list recovery, the resulting code will also have linear time list recovery.

We assemble these ingredients as follows. To encode a message $x \in F_q^{R_1 m}$, we first encode it using $C_1$, to
obtain $y \in F_q^m$. Then we break $[m]$ into $n := m/(R_0 d)$ blocks of size $R_0 d$, and write $y = (y^{(1)}, \ldots, y^{(n)})$ for
$y^{(i)} \in F_q^{R_0 d}$. We encode each part $y^{(i)}$ using $C_0$ to obtain $z^{(i)} \in F_q^d$. Finally, we "redistribute" the symbols
of $z = (z^{(1)}, \ldots, z^{(n)})$ according to the expander graph $G$ to obtain a codeword $c \in (F_q^d)^n$ as follows. We
identify symbols in $z$ with left-hand vertices, $U$, in $G$ and symbols in $c$ with right-hand vertices, $V$, in $G$.
For any right-hand vertex $v \in V$, the $v$th symbol of $c$ is

$c^{(v)} = (a_1, \ldots, a_d) \in F_q^d$,

where the $a_i$ are defined such that if $\Gamma_i(v) = u$ and $\Gamma_j(u) = v$, then $a_i = z_j^{(u)}$. Intuitively, the $d$ components of
$z^{(u)}$ are sent out on the $d$ edges defined by $\Gamma(u)$ and the $d$ components of $c^{(v)}$ are the $d$ symbols coming in on
the $d$ edges defined by $\Gamma(v)$.
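The redistribution step can be sketched in a few lines of Python. This is an illustrative sketch only: the representation of $G$ by two ordered neighbor tables (`left_nbrs[u][j]` for $\Gamma_j(u)$ and `right_nbrs[v][i]` for $\Gamma_i(v)$), the assumption that $G$ has no parallel edges, and the function name `redistribute` are all choices made for this example.

```python
def redistribute(z, left_nbrs, right_nbrs):
    """Step (4): move the symbols of the inner codewords z[u] along the edges of G.

    z[u] is the length-d inner codeword at left vertex u.
    left_nbrs[u][j] = v   means Gamma_j(u) = v (the j-th neighbor of u).
    right_nbrs[v][i] = u  means Gamma_i(v) = u (the i-th neighbor of v).
    Returns c, where c[v] is the length-d tuple of symbols arriving at v.
    """
    n, d = len(z), len(z[0])
    c = []
    for v in range(n):
        symbol = []
        for i in range(d):
            u = right_nbrs[v][i]        # Gamma_i(v) = u
            j = left_nbrs[u].index(v)   # the index j with Gamma_j(u) = v
            symbol.append(z[u][j])      # a_i = z_j^{(u)}
        c.append(tuple(symbol))
    return c

# Toy example: n = 2, d = 2, G is the complete bipartite graph K_{2,2}.
left_nbrs = [[0, 1], [1, 0]]
right_nbrs = [[0, 1], [0, 1]]
z = [("a", "b"), ("c", "d")]
print(redistribute(z, left_nbrs, right_nbrs))
```

The step is a fixed permutation of the $nd$ symbols of $z$, so it preserves the rate and costs time linear in $nd$ once the index lookups are precomputed.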

















It is easy to verify that the rate of $C$ is

$R = R_0 \cdot R_1 = (\alpha - 2\eta/3)(1 - \eta/3) \ge \alpha - \eta$.
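For completeness, the inequality in the rate computation can be checked by expanding the product; the only facts used are $\alpha \le 1$ and $\eta^2 \ge 0$.

```latex
\begin{align*}
R_0 \cdot R_1
  &= \left(\alpha - \tfrac{2\eta}{3}\right)\left(1 - \tfrac{\eta}{3}\right)
   = \alpha - \tfrac{2\eta}{3} - \tfrac{\alpha\eta}{3} + \tfrac{2\eta^2}{9} \\
  &\ge \alpha - \tfrac{2\eta}{3} - \tfrac{\eta}{3}
   = \alpha - \eta .
\end{align*}
```

Since $\alpha = R + \eta$, this gives rate at least $R$, as required by Theorem 5.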

Next, we give the linear-time list recovery algorithm for $C$ and argue that it works. Fix a set $A \subseteq V$ of $\alpha n$
coordinates so that each $v \in A$ has an associated list $S_v \subseteq F_q^d$ of size at most $\ell$. First, we distribute these
lists back along the expander. Let $B \subseteq U$ be the set of vertices $u$ so that

$|\Gamma(u) \cap A| \ge (|A|/n - \eta/3) d = (\alpha - \eta/3) d$.

The structure of $G$ ensures that $|B| \ge \alpha_1 n$. For each of the vertices $u \in B$, the corresponding codeword $z^{(u)}$
of $C_0$ has at least an $(\alpha - \eta/3)$ fraction of lists of size $\ell$. Thus, for each such $u \in B$, we may recover a list $T_u$
of at most $\ell_1$ codewords of $C_0$ which are candidates for $z^{(u)}$. Notice that because $C_0$ has constant size, this
whole step takes time linear in $n$. These lists $T_u$ induce lists $T_i$ of size $\ell_1$ for at least an $\alpha_1$ fraction of the
indices $i \in [m]$. Now we use the fact that $C_1$ can be list recovered in linear time from $\alpha_1 m$ such lists; this
produces a list of $L_1$ possibilities for the original message $x$, in time linear in $m$ (and hence in $n = m/(R_0 d)$)
where $L_1$ depends only on $\alpha_1$ and $\ell_1$. Tracing backwards, $\alpha_1$ depends only on $\eta$, and $\ell_1$ depends on $\ell$ and $\eta$.
Thus, $L_1$ is a constant depending only on $\ell$ and $\eta$, as claimed. □


4 Recovery procedure and proof of Theorem 2 

In the rest of the paper, we prove Theorem 2, and present our algorithm. The list recovery algorithm is 
presented in Algorithm 2, and proceeds in three steps, which we discuss in the next three sections. 

1. First, we list recover locally at each vertex. We describe this first step and set up some notation in Section 4.1.

2. Next, we give an algorithm that recovers a list of $\ell$ ways to choose symbols on a constant fraction of
the edges of $H$, using the local information from the first step. This is described in Section 4.2, and
the algorithm for this step is given as Algorithm 1.

3. Finally, we repeat Algorithm 1 a constant number of times (making more choices and hence increasing 
the list size) to form our final list. This third step is presented in Section 4.3. 

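Fix a parameter $\varepsilon > 0$ to be determined later. 3 Set

$\alpha^* = \alpha^*(\alpha_0) := 1 - \varepsilon \ell^L \left( \frac{1 - \alpha_0}{2 - \alpha_0} \right)$.    (2)

We assume that $\alpha \ge \alpha^*$. We will eventually choose $\varepsilon$ so that this requirement becomes the requirement in
Theorem 2.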


4.1 Local list recovery 


In the first part of Algorithm 2 we locally list recover at each “good” vertex. Below, we define “good” 
vertices, along with some other notation which will be useful for the rest of the proof, and record a few 
consequences of this step. 

For each edge $e \in E(H)$, we are given a list $\mathcal{L}_e$, with the guarantee that at least an $\alpha \ge \alpha^*$ fraction of
the lists $\mathcal{L}_e$ are of size at most $\ell$. We call an edge good if its list size is at most $\ell$, and bad otherwise. Thus,
there are at least a

$\beta = \beta(\alpha_0) := 1 - \frac{1 - \alpha^*}{2(1 - \alpha_0)}$    (3)
3 For reference, we have included a table of our notation for the proof of Theorem 2 in Figure 3. 
Symbol | Meaning
$(\alpha_0, \ell, L)$ | List recovery parameters of inner code $C_0$
$\delta_0$ | The inner code $C_0$ can recover from a $\delta_0$ fraction of erasures
$n$ | Number of vertices in the graph $G$
$d$ | Degree of the graph (and length of inner code)
$\lambda$ | Normalized second eigenvalue of the graph
$\varepsilon$ | A parameter which we will choose to be $\delta_0/(2k\ell^L)$. We will find a large subgraph $H' \subseteq H$ so that every equivalence class in $H'$ has size at least $\varepsilon d$.
$k$ | Parameter such that $k > 2/(\delta_0 - \lambda)$.
$\alpha$ | The final expander code is list recoverable from an $\alpha$ fraction of erasures
$\alpha^*$ | Bound on the agreement $\alpha$. We set $\alpha^* = 1 - \varepsilon \ell^L \left( \frac{1 - \alpha_0}{2 - \alpha_0} \right)$; we assume $\alpha \ge \alpha^*$
$\beta$ | Bound on the fraction of bad vertices. We set $\beta = 1 - \frac{1 - \alpha^*}{2(1 - \alpha_0)}$

Figure 3: Glossary of notation for the proof of Theorem 2.


fraction of vertices which have at least $\alpha_0 d$ good incident edges. Call these vertices good, and call the rest
of them bad. For a vertex $v$, define the good neighbors $G(v) \subseteq \Gamma(v)$ by

$G(v) = \begin{cases} \Gamma(v) & v \text{ is bad} \\ \{u \in \Gamma(v) : (v, u) \text{ is good}\} & v \text{ is good} \end{cases}$
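The classification of edges and vertices can be sketched as follows. The data layout (an adjacency dictionary `adj` and a dictionary `list_sizes` keyed by unordered edges) and the function name are assumptions made for this illustration.

```python
def classify(adj, list_sizes, ell, alpha0):
    """Classify vertices as good or bad and compute the good neighbors G(v).

    adj[v] is the list of neighbors of v (a d-regular graph is assumed);
    list_sizes[frozenset((u, v))] is the size of the input list on edge (u, v).
    An edge is good if its list has size at most ell; a vertex is good if at
    least alpha0 * d of its incident edges are good.
    """
    good_vertex, good_nbrs = {}, {}
    for v, nbrs in adj.items():
        g = [u for u in nbrs if list_sizes[frozenset((v, u))] <= ell]
        good_vertex[v] = len(g) >= alpha0 * len(nbrs)
        # G(v) is all of Gamma(v) for a bad vertex, as in the definition above.
        good_nbrs[v] = g if good_vertex[v] else list(nbrs)
    return good_vertex, good_nbrs

# Toy example: a triangle (2-regular) whose edge {0, 2} has an oversized list.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
list_sizes = {frozenset((0, 1)): 1, frozenset((1, 2)): 1, frozenset((0, 2)): 5}
print(classify(adj, list_sizes, ell=2, alpha0=1.0))
```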


Now, the first step of Algorithm 2 is to run the list recovery algorithm for $C_0$ on all of the good vertices.
Notice that because $C_0$ has constant size, this takes constant time per vertex. We recover lists $S_v$ at each good vertex $v$.
For bad vertices $v$, we set $S_v = C_0$ for notational convenience (we will never use these lists in the algorithm).
We record the properties of these lists $S_v$ below, and we use the shorthand $(\alpha_0, \ell, L)$-legit to describe them.

Definition 4. A collection $\{S_v\}_{v \in V(H)}$ of sets $S_v \subseteq C_0$ is $(\alpha_0, \ell, L)$-legit if the following hold.

1. For at least $\beta n$ vertices $v$ (the good vertices), $|S_v| \le L$.

2. For every good vertex $v$, at most $(1 - \alpha_0) d$ indices $i \in [d]$ have list-cover size $|LC(S_v)_i| > \ell$.

3. There are at most a $(1 - \alpha^*) + 2(1 - \beta)$ fraction of edges which are either bad or adjacent to a bad
vertex.

Above, $\beta$ is as in Equation (3), and $\alpha^*$ is as in Equation (2).

The above discussion implies that the sets $\{S_v\}$ in Algorithm 2 are $(\alpha_0, \ell, L)$-legit.


4.2 Partial recovery from lists of inner codewords 

Now suppose that we have a collection of $(\alpha_0, \ell, L)$-legit sets $\{S_v\}_{v \in V(H)}$. We would like to recover all of the
codewords in $C(H, C_0)$ consistent with these lists. The basic observation is that choosing one symbol on one
edge is likely to fix a number of other symbols at that vertex. To formalize this, we introduce a notion of
local equivalence classes at a vertex.

Definition 5 (Equivalence Classes of Indices). Let $\{S_v\}_{v \in V(H)}$ be $(\alpha_0, \ell, L)$-legit and fix a good vertex $v \in V(H)$.
For each $u \in G(v)$, define

$\varphi^{(u)} : S_v \to LC(S_v)_u \subseteq F_q$,
$c \mapsto c_u$ (the $u$th symbol of codeword $c$).

Define an equivalence relation on $G(v)$ by

$u \sim_v u' \iff$ there is a permutation $\pi : F_q \to F_q$ so that $\pi \circ \varphi^{(u')} = \varphi^{(u)}$.

For notational convenience, for $u \notin G(v)$, we say that $u$ is equivalent to itself and nothing else. Define
$\mathcal{E}(v, u) \subseteq E(H)$ to be the (local) equivalence class of edges at $v$ containing $(v, u)$:

$\mathcal{E}(v, u) = \{(v, u') : u' \sim_v u\}$.

For $u \notin G(v)$, $|\mathcal{E}(v, u)| = 1$, and we call this class trivial.

It is easily verified that $\sim_v$ is indeed an equivalence relation on $\Gamma(v)$, so the equivalence classes are
well-defined. Notice that $\sim_v$ is specific to the vertex $v$: in particular, $\mathcal{E}(v, u)$ is not necessarily the same as
$\mathcal{E}(u, v)$. For convenience, for bad vertices $v$, we say that $\mathcal{E}(v, u) = \{(v, u)\}$ for all $u \in \Gamma(v)$ (all of the local
equivalence classes at $v$ are trivial).

We observe a few facts that follow immediately from the definition of $\sim_v$.

1. For each $u \in G(v)$, we have $|\mathrm{LC}(S_v)_u| \le \ell$ by the assumption that $S_v$ is legit. Thus, there are at most $\ell^L$
choices for $\varphi^{(u)}$, and so there are at most $\ell^L$ nontrivial equivalence classes $\mathcal{E}(v,u)$ (that is, classes of
size larger than 1).

2. The average size of a nontrivial equivalence class $\mathcal{E}(v,u)$ is at least $\alpha_0 d / \ell^L$.

3. If $u \sim_v u'$, then for any $c \in S_v$ the symbol $c_u$ determines the symbol $c_{u'}$. Indeed, $c_u = \pi(c_{u'})$ where $\pi$
is the permutation in the definition of $\sim_v$. In particular, learning the symbol on $(v,u)$ determines the
symbol on $(v,u')$ for all $u' \sim_v u$.
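As an illustration of Definition 5, the following minimal Python sketch groups the coordinates of one vertex into local equivalence classes. It relies on the observation that two coordinates are related by a permutation of the alphabet exactly when they induce the same partition of $S_v$. The function name `local_classes` and the toy list `S_v` are hypothetical and chosen only for this example.

```python
from collections import defaultdict

def local_classes(S_v, good):
    """Group the good indices of one vertex into local equivalence classes.

    S_v  : list of codewords (equal-length tuples of symbols)
    good : iterable of coordinate indices treated as good
    """
    classes = defaultdict(list)
    for u in good:
        relabel = {}
        # Canonical form of phi^(u): rename symbols in order of first appearance,
        # so two coordinates get the same signature iff a permutation relates them.
        signature = tuple(relabel.setdefault(c[u], len(relabel)) for c in S_v)
        classes[signature].append(u)
    return sorted(classes.values())

# Toy list of three inner codewords of length 4 (illustrative only).
S_v = [(0, 1, 0, 2), (1, 0, 1, 2), (2, 2, 2, 0)]
print(local_classes(S_v, range(4)))
```

In this toy list, coordinates 0, 1 and 2 take three distinct values on the three codewords, so they fall into one class, while coordinate 3 repeats a value and forms a class of its own.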

The idea of the partial recovery algorithm follows from this last observation. If we pick an edge at
random and assign it a value, then we expect that this determines the value of about $\alpha_0 d / \ell^L$ other edges.
These choices should propagate through the expander graph, and end up assigning a constant fraction of the
edges. We make this precise in Algorithm 1. To make the intuition formal, we also define a notion of global
equivalence between two edges.

Definition 6 (Global Equivalence Classes). For an expander code $C(H, C_0)$, and $(\alpha_0, \ell, L)$-legit lists $\{S_v\}_{v \in V(H)}$,
we define an equivalence relation $\sim$ as follows. For good vertices $a, b, u, v$, we say

$$(a, b) \sim (u, v)$$

if there exists a path from $(a, b)$ to $(u, v)$ where each adjacent pair of edges is in its local equivalence relation,
i.e., there exists $(w_0 = a, w_1 = b), (w_1, w_2), \ldots, (w_{n-2}, w_{n-1}), (w_{n-1} = u, w_n = v)$ so that

$$(w_i, w_{i+1}) \in \mathcal{E}(w_{i+1}, w_{i+2}) \text{ for } i = 0, \ldots, n-2 \quad \text{and} \quad (w_i, w_{i+1}) \text{ is good for all } i = 0, \ldots, n-1.$$

Let $\mathcal{E}^H_{(u,v)} \subseteq E(H)$ denote the global equivalence class of the edge $(u,v)$.

It is not hard to check that this indeed forms an equivalence relation on the edges of $H$ and that a single
decision about which of the $\ell$ symbols appears on an edge $(u, v)$ forces the assignment of all edges in $\mathcal{E}^H_{(u,v)}$.

Algorithm 1 takes an edge $(u, v)$ and iterates through all $\ell$ possible assignments to $(u, v)$ and turns this
into $\ell$ possible assignments for the vectors $(c_e)_{e \in \mathcal{E}^H_{(u,v)}}$. In order for Algorithm 1 to be useful, the graph $H$
should have some large equivalence classes. Since each good vertex has at most $\ell^L$ nontrivial equivalence
classes which partition its $\ge \alpha_0 d$ good edges, most of the nontrivial local equivalence classes are larger than
$\alpha_0 d / \ell^L$. This means that a large fraction of the edges are themselves contained in large local equivalence classes.
This is formalized in Lemma 6.
Algorithm 1: Partial decision algorithm

Input: Lists $S_v \subseteq C_0$ which are $(\alpha_0, \ell, L)$-legit, and a starting good edge $(u,v)$, where both $u$ and $v$
are good vertices.

Output: A collection of at most $\ell$ partial assignments $x^{(\sigma)} \in (\mathbb{F}_q \cup \{\bot\})^{E(H)}$, for
$\sigma \in \mathrm{LC}(S_v)_u \cap \mathrm{LC}(S_u)_v$.

 1  for $\sigma \in \mathrm{LC}(S_v)_u \cap \mathrm{LC}(S_u)_v$ do
 2      Initialize $x^{(\sigma)} = (\bot, \ldots, \bot) \in (\mathbb{F}_q \cup \{\bot\})^{E(H)}$
 3      for $(v,u') \in \mathcal{E}(v,u)$ do
 4          Set $x^{(\sigma)}_{(v,u')}$ to the only value consistent with the assignment $x^{(\sigma)}_{(u,v)} = \sigma$
 5      end
 6      for $(v',u) \in \mathcal{E}(u,v)$ do
 7          Set $x^{(\sigma)}_{(v',u)}$ to the only value consistent with the assignment $x^{(\sigma)}_{(u,v)} = \sigma$
 8      end
 9      Initialize a list $W_0 = \{u, v\}$.
10      for $t = 0, 1, \ldots$ do
11          If $W_t = \emptyset$, break.
12          $W_{t+1} = \emptyset$
13          for $a \in W_t$ do
14              for $b \in \Gamma(a)$ where $x^{(\sigma)}_{(a,b)} \ne \bot$ do
15                  Add $b$ to $W_{t+1}$
16                  for $(b, c) \in \mathcal{E}(b, a)$ do
17                      Set $x^{(\sigma)}_{(b,c)}$ to the only value consistent with the assignment $c_{(a,b)} = x^{(\sigma)}_{(a,b)}$
18                  end
19              end
20          end
21      end
22  end
23  return $x^{(\sigma)}$ for $\sigma \in \mathrm{LC}(S_v)_u \cap \mathrm{LC}(S_u)_v$.





Lemma 6. Suppose that $\{S_v\}$ is $(\alpha_0, \ell, L)$-legit, and consider the local equivalence classes defined with respect
to $S_v$. There is a large subgraph $H'$ of $H$ so that $H'$ contains only edges in large local equivalence classes. In
particular,

• $V(H') = V(H)$,

• for all $(v, u) \in E(H')$, $|\mathcal{E}(v, u) \cap E(H')| \ge \varepsilon d$,

• $|E(H')| \ge \frac{nd}{2}\left(1 - 3\varepsilon\ell^L\right)$.

Proof. Consider the following process:

• Remove all of the bad edges from $H$, and remove all of the edges incident to a bad vertex.

• While there are any vertices $v, u \in V(H)$ with $|\mathcal{E}(v,u)| < \varepsilon d$:

   - Delete all classes $\mathcal{E}(v,u)$ with $|\mathcal{E}(v,u)| < \varepsilon d$.

We claim that the above process removes at most a

$$2\ell^L\varepsilon + (1 - \alpha^*) + 2(1 - \beta)$$

fraction of edges from $H$. By the definition of $(\alpha_0, \ell, L)$-legit, there are at most a $(1 - \alpha^*) + 2(1 - \beta)$ fraction
removed in the first step. To analyze the second step, call a good vertex $v \in V(H)$ active in a round if we
remove $\mathcal{E}(v,u)$. Each good vertex is active at most $\ell^L$ times, because there are at most $\ell^L$ nontrivial classes
$\mathcal{E}(v,u)$ for every $v$ (and we have already removed all of the trivial classes in the first step). At each good
vertex, every time it is active, we delete at most $\varepsilon d$ edges. Thus, we have deleted a total of at most

$$n \cdot \ell^L \cdot \varepsilon d$$

edges, which is a $2\ell^L\varepsilon$ fraction of the $nd/2$ edges of $H$, and this proves the claim. Finally, we observe that our
choice of $\alpha^*$ and $\beta$ in (2) and (3) respectively implies that $(1 - \alpha^*) + 2(1 - \beta) < \varepsilon\ell^L$. Since the remaining
edges belong to classes of size at least $\varepsilon d$, this proves the lemma. □
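The second step of the process in the proof above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the bad edges are taken to be removed already, each vertex comes with a ready-made partition of its incident edges into local classes, and the names `edge` and `prune_small_classes` are hypothetical.

```python
def edge(a, b):
    """Undirected edge between vertices a and b."""
    return frozenset((a, b))

def prune_small_classes(classes, threshold):
    """Repeatedly delete local classes with fewer than `threshold` surviving edges.

    classes: dict vertex -> list of sets of edges (the local classes at that vertex).
    Returns the set of edges that survive the pruning.
    """
    alive = set()
    for parts in classes.values():
        for cls in parts:
            alive |= cls
    changed = True
    while changed:
        changed = False
        for parts in classes.values():
            for cls in parts:
                remaining = cls & alive
                # A class that has shrunk below the threshold is deleted entirely,
                # which may in turn shrink classes at the other endpoints.
                if remaining and len(remaining) < threshold:
                    alive -= remaining
                    changed = True
    return alive

# Toy instance: a triangle on {1, 2, 3} plus a pendant edge (1, 4).
classes = {
    1: [{edge(1, 2), edge(1, 3)}, {edge(1, 4)}],
    2: [{edge(1, 2), edge(2, 3)}],
    3: [{edge(1, 3), edge(2, 3)}],
    4: [{edge(1, 4)}],
}
print(sorted(sorted(e) for e in prune_small_classes(classes, 2)))
```

With threshold 2 only the pendant edge is removed. With threshold 3 the deletions cascade and no edge survives, which is the behaviour that the counting argument in the proof has to control.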


A basic fact about expanders is that if a subset S of vertices has a significant fraction of its edges contained 
in S, then S itself must be large. This is formalized in Lemma 7. 

Lemma 7. Let $H$ be a $d$-regular expander graph with normalized second eigenvalue $\lambda$. Let $S \subseteq V(H)$ with
$|S| < (\varepsilon - \lambda)n$, and $F \subseteq E(H)$ so that for all $v \in S$,

$$|\{e \in F : e \text{ is adjacent to } v\}| \ge \varepsilon d.$$

Then

$$|\Gamma_F(S)| > |S|.$$

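Proof. The proof follows from the expander mixing lemma. Let $T = \Gamma_F(S)$. Then

$$\varepsilon d|S| \le E(S, T) \le \frac{d|S||T|}{n} + d\lambda\sqrt{|S||T|} \le \frac{d|S||T|}{n} + \frac{d\lambda(|S| + |T|)}{2}.$$

Thus, we have

$$|T| \ge |S| \left( \frac{\varepsilon - \lambda/2}{|S|/n + \lambda/2} \right).$$

In particular, as long as $|S| < n(\varepsilon - \lambda)$, we have

$$|T| > |S|.$$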


□ 
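As a quick numerical sanity check of the final bound in the proof above, the sketch below evaluates the factor $(\varepsilon - \lambda/2)/(|S|/n + \lambda/2)$ for a few set sizes. The parameter values are illustrative assumptions, not values used elsewhere in the paper, and `neighborhood_growth` is a hypothetical helper name.

```python
def neighborhood_growth(eps, lam, s_frac):
    """Lower bound on |T| / |S| from the proof of Lemma 7, where s_frac = |S| / n."""
    return (eps - lam / 2) / (s_frac + lam / 2)

# Illustrative values (assumptions): eps = 0.3 and lam = 0.1, so the threshold
# on |S| / n in the lemma is eps - lam = 0.2.
for s_frac in (0.05, 0.15, 0.2):
    print(s_frac, neighborhood_growth(0.3, 0.1, s_frac))
```

The factor exceeds 1 strictly below the threshold $|S|/n = \varepsilon - \lambda$ and equals 1 at the threshold, which matches the hypothesis of the lemma.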
Lemma 8 (Expanders have large global equivalence classes). If $(u, v)$ is sampled uniformly from $E(H)$, then

$$\Pr_{(u,v)} \left\{ \left|\mathcal{E}^H_{(u,v)}\right| \ge \frac{nd}{2}\,\varepsilon(\varepsilon - \lambda) \right\} \ge 1 - 3\varepsilon\ell^L.$$


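Proof. From Lemma 6, there is a subgraph $H' \subseteq H$ such that $|E(H')| \ge |E(H)|(1 - 3\varepsilon\ell^L)$. Let $(u,v) \in E(H')$,
and consider $\mathcal{E}^{H'}_{(u,v)}$. Let $S$ be the set of vertices in $H'$ that are adjacent to an edge in $\mathcal{E}^{H'}_{(u,v)}$, i.e.,

$$S = \{w \in V(H') \mid (w,z) \in \mathcal{E}^{H'}_{(u,v)} \text{ for some } z \in V(H')\}.$$

Since every local equivalence class in $H'$ is of size at least $\varepsilon d$, every vertex in $S$ has at least $\varepsilon d$ edges
in $\mathcal{E}^{H'}_{(u,v)}$. By definition of $S$, for every edge in $\mathcal{E}^{H'}_{(u,v)}$ both its endpoints are in $S$, and so
$\Gamma_{\mathcal{E}^{H'}_{(u,v)}}(S) = S$. Thus by Lemma 7, it must be that $|S| \ge n(\varepsilon - \lambda)$. Thus

$$\left|\mathcal{E}^{H'}_{(u,v)}\right| \ge \frac{\varepsilon d}{2}|S| \ge \frac{nd}{2}\,\varepsilon(\varepsilon - \lambda).$$

Then any edge in $H'$ is contained in an equivalence class of size at least $\frac{nd}{2}\varepsilon(\varepsilon - \lambda)$, and the result follows
from the fact that $|E(H')| \ge (1 - 3\varepsilon\ell^L)|E(H)|$. □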

Finally, we are in a position to prove that Algorithm 1 does what it’s supposed to. 

Lemma 9. Algorithm 1 produces a list of at most $\ell$ partial assignments $x^{(\sigma)}$ so that

1. Each of these partial assignments assigns values to the same set. Further, this set is the global
equivalence class $\mathcal{E}^H_{(v,u)}$ where $(v, u)$ is the initial edge given as input to Algorithm 1.

2. For at least a $(1 - 3\varepsilon\ell^L)$ fraction of initial edges $(v,u)$ we have $|\mathcal{E}^H_{(v,u)}| \ge \varepsilon(\varepsilon - \lambda)|E(H)|$.

3. Algorithm 1 can be implemented so that the running time is $O_{\ell,L}(|\mathcal{E}^H_{(v,u)}|)$.

Proof. For the first point, notice that at each $t$ in Algorithm 1, the algorithm looks at each vertex it visited
in the last round (the set $W_{t-1}$) and assigns values to all edges in the (local) equivalence classes $\mathcal{E}(v,u)$ for
all $v \in W_{t-1}$ for which at least one other value in $\mathcal{E}(v,u)$ was known. Thus if there is a path of length $p$
from $(v, u)$ to $(z, w)$ walking along (local) equivalence classes, the edge $(z, w)$ will get assigned by Algorithm
1 in at most $p$ steps.

For the second point, by Lemma 8, for at least a $(1 - 3\varepsilon\ell^L)$ fraction of the initial starting edges $(v, u)$ we
have $|\mathcal{E}^H_{(v,u)}| \ge \varepsilon(\varepsilon - \lambda)|E(H)|$.

Finally, we remark on running time. Since each edge is only in two (local) equivalence classes (one for
each vertex), it can only be assigned twice during the running of the algorithm (Algorithm 1 line 17). Since
each edge can only be assigned twice, the total running time of the algorithm will be $O(|\mathcal{E}^H_{(v,u)}|)$. □
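The propagation in Algorithm 1 explores exactly the global equivalence class of the starting edge. The following minimal sketch computes that class by breadth-first search, ignoring the symbol values themselves. It assumes the local classes are given as a lookup table; the name `global_class` and the dictionary layout are hypothetical choices for this example.

```python
from collections import deque

def global_class(start, local):
    """Edges reachable from `start` by repeatedly moving within local classes.

    local: dict mapping (vertex, edge) -> set of edges in the local class at
           `vertex` that contains `edge`; edges are frozensets of two vertices.
           A missing key is treated as a trivial class containing only `edge`.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        e = queue.popleft()
        for v in e:                                   # both endpoints of the edge
            for other in local.get((v, e), {e}):
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return seen

# Toy path a - b - c - d: the two edges at b are locally equivalent,
# while the two edges at c are not.
ab, bc, cd = frozenset("ab"), frozenset("bc"), frozenset("cd")
local = {("b", ab): {ab, bc}, ("b", bc): {ab, bc}, ("c", bc): {bc}, ("c", cd): {cd}}
print(global_class(ab, local) == {ab, bc})
```

Each edge enters the queue at most once, so the amount of work is governed by the size of the class being explored and by the degree $d$, in the spirit of the third point of Lemma 9.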


4.3 Turning partial assignments into full assignments 

Given lists $\{\mathcal{L}_e\}_{e \in E(H)}$, Algorithm 2 runs the local recovery algorithm at each good vertex $v$ to obtain
$(\alpha_0, \ell, L)$-legit sets $S_v$. Then, given $(\alpha_0, \ell, L)$-legit sets $S_v$, Algorithm 1 can find $\ell$ partial codewords, defined
on $\mathcal{E}^H_{(v,u)}$. To turn this into a full list recovery algorithm, we simply need to run Algorithm 1 multiple
times, obtaining partial assignments on disjoint equivalence classes, and then stitch these partial assignments
together. If we run Algorithm 1 $t$ times, then stitching the lists together we will obtain at most $\ell^t$ possible
codewords; this will give us our final list of size $L'$. This process is formalized in Algorithm 2.

The following theorem asserts that Algorithm 2 works, as long as we can choose $\lambda$ and $\varepsilon$ appropriately.
Algorithm 2: List recovery for expander codes

Input: A collection of lists $\mathcal{L}_e \subseteq \mathbb{F}_q$ for at least an $\alpha^*$ fraction of the $e \in E(H)$, $|\mathcal{L}_e| \le \ell$.

Output: A list $\mathcal{L}'$ of assignments $c \in C$ that are consistent with all of the lists $\mathcal{L}_e$.

Divide the vertices and edges of $H$ into good and bad vertices and edges, as per Section 4.1.
Run the list recovery algorithm of $C_0$ at each good vertex $v \in V(H)$ on the lists $\{\mathcal{L}_{(v,u)} : u \in \Gamma(v)\}$ to obtain
    $(\alpha_0, \ell, L)$-legit lists $S_v \subseteq C_0$.
Initialize the set of unassigned edges $U = E(H)$.
Initialize the set of bad edges $B$ to be the bad edges along with the edges adjacent to bad vertices.
Initialize $T = \emptyset$.
for $t = 1, 2, \ldots$ do
    if $U \subseteq B$ then
        break
    end
    Choose an edge $(v,u) \leftarrow U \setminus B$.
    Run Algorithm 1 on the collection $\{S_a : a \in V(H)\}$ and on the starting edge $(v,u)$. This returns
        a list $x_t^{(1)}, \ldots, x_t^{(\ell)}$ of assignments to the edges in $\mathcal{E}^H_{(u,v)}$.
        /* Notice that the notation $x_t^{(j)}$ differs from that in Algorithm 1 */
    if $|\mathcal{E}^H_{(u,v)}| \ge \varepsilon(\varepsilon - \lambda)\frac{nd}{2}$ then
        Set $U = U \setminus \mathcal{E}^H_{(u,v)}$ and set $T = T \cup \{t\}$
    end
    else
        $B = B \cup \mathcal{E}^H_{(u,v)}$
    end
end
for each choice of indices $j_t \in \{1, \ldots, \ell\}$, one for every $t \in T$, do
    Concatenate the (disjoint) assignments $\{x_t^{(j_t)} : t \in T\}$ to obtain an assignment $x$.
    Run the unique decoding algorithm for erasures (as in Lemma 1) for the expander code to correct
        the partial assignment $x$ to a codeword $c \in C$.
    If $c$ agrees with the original lists $\mathcal{L}_e$, add it to the output list $\mathcal{L}'$.
end
Return $\mathcal{L}' \subseteq C$.
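The stitching loop at the end of Algorithm 2 enumerates one partial assignment per large equivalence class. A minimal Python sketch of that enumeration is given below, assuming each partial assignment is stored as a dictionary from edges to symbols; the name `stitch` and this representation are hypothetical.

```python
from itertools import product

def stitch(partials):
    """Yield every concatenation of one partial assignment per class.

    partials: list (indexed by t in T) of lists of dicts mapping edge -> symbol.
              Dicts for the same t share one key set; key sets for different t
              are assumed disjoint, as the equivalence classes are.
    """
    for choice in product(*partials):
        merged = {}
        for part in choice:
            merged.update(part)
        yield merged

# Two classes with 2 and 3 candidate assignments give 2 * 3 = 6 stitched candidates.
partials = [
    [{"e1": 0}, {"e1": 1}],
    [{"e2": 5}, {"e2": 6}, {"e2": 7}],
]
print(len(list(stitch(partials))))
```

With $|T|$ classes and at most $\ell$ candidates per class, the number of stitched assignments is at most $\ell^{|T|}$. In the full algorithm each stitched assignment would still have to be completed by the erasure decoder and checked against the input lists.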
Theorem 10. Suppose that the inner code $C_0$ is $(\alpha_0, \ell, L)$-list recoverable and has distance $\delta_0$. Let
$\alpha \ge \alpha^*$ as in Equation (2). Choose $k > 0$ so that $\lambda < \delta_0 - \frac{2}{k}$, and set $\varepsilon = \frac{\delta_0}{3k\ell^L}$. Then Algorithm 2 returns
a list of at most

$$L' = \ell^{\frac{1}{\varepsilon(\varepsilon - \lambda)}}$$

codewords of $C$. Further, this list contains every codeword consistent with the lists $\mathcal{L}_e$. In particular, $C$ is
$(\alpha, \ell, L')$-list recoverable from erasures.

The running time of Algorithm 2 is $O_{\ell,L,\varepsilon}(nd)$.

Proof of Theorem 10. First, we verify the list size. For each $t \in T$, Algorithm 1 covered at least an $\varepsilon(\varepsilon - \lambda)$
fraction of the edges, so $|T| \le \frac{1}{\varepsilon(\varepsilon - \lambda)}$. Thus the number of possible partial assignments $x$ is at most

$$\ell^{|T|} \le \ell^{\frac{1}{\varepsilon(\varepsilon - \lambda)}}.$$

Next, we verify correctness. By Lemma 8, at least a $1 - 3\varepsilon\ell^L$ fraction of the edges are in equivalence
classes of size at least $(nd/2)\varepsilon(\varepsilon - \lambda)$. Thus at the end of Algorithm 2 at most a $3\varepsilon\ell^L$ fraction of the edges
are uncorrected. By Lemma 1, we can correct these erasures in linear time as long as $3\varepsilon\ell^L \le \delta_0/k$ (which was our
choice of $\varepsilon$), and as long as $\lambda < \delta_0 - \frac{2}{k}$ (which was our assumption). Thus, Algorithm 2 can uniquely complete all
of its partial assignments. Since any codeword $c \in C$ which agrees with all of the lists agrees with at least
one of the partial assignments $x$, we have found them all.

Finally, we consider runtime. As a pre-processing step, Algorithm 2 takes $O(T_d n)$ steps to run the inner
list recovery algorithm at each vertex, where $T_d \le O(d\ell|C_0|)$ is the time it takes to list recover the inner code
$C_0$.$^4$ It takes another $O(dn)$ steps of preprocessing to set up the appropriate graph data structures. Now
we come to the first loop, over $t$. By Lemma 9, the equivalence classes $\mathcal{E}^H_{(u,v)}$ form a partition of the edges,
and at least a $(1 - 3\varepsilon\ell^L)$ fraction of the edges are in parts of size at least $\frac{nd}{2}\varepsilon(\varepsilon - \lambda)$. By construction, we
encounter each class only once; because the running time of Algorithm 1 is linear in the size of the part, the
total running time of this loop is $O(dn)$. Finally, we loop through and output the final list, which takes time
$O(L'dn)$, using the fact (Lemma 1) that the unique decoder for expander codes runs in linear time. □
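To get a feeling for the size of the quantities in Theorem 10, the sketch below evaluates $\varepsilon = \delta_0/(3k\ell^L)$ and the exponent $1/(\varepsilon(\varepsilon - \lambda))$ with exact rational arithmetic. The numerical values are illustrative assumptions only, and `theorem10_parameters` is a hypothetical helper; the check $\lambda < \varepsilon$ is included because the exponent is only meaningful when $\varepsilon - \lambda > 0$.

```python
from fractions import Fraction

def theorem10_parameters(delta0, k, ell, L, lam):
    """Return (eps, exponent) with eps = delta0 / (3 k ell^L) and
    exponent = 1 / (eps (eps - lam)), so the list size is at most ell ** exponent."""
    delta0, lam = Fraction(delta0), Fraction(lam)
    if not lam < delta0 - Fraction(2, k):
        raise ValueError("need lam < delta0 - 2/k")
    eps = delta0 / (3 * k * ell ** L)
    if not lam < eps:
        raise ValueError("need lam < eps for a positive exponent")
    return eps, 1 / (eps * (eps - lam))

# Illustrative values (assumptions): delta0 = 1/2, k = 8, ell = L = 2, lam = 1/384.
eps, exponent = theorem10_parameters(Fraction(1, 2), 8, 2, 2, Fraction(1, 384))
print(eps, exponent)
```

Even for these small parameters the exponent is in the tens of thousands, which illustrates that the list size $L'$ is constant in $n$ but far from small.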

Finally, we pick parameters and show how Theorem 10 implies Theorem 2. 

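Proof of Theorem 2. Theorem 2 requires choosing appropriate parameters to instantiate Algorithm 2. In
order to apply Theorem 10, we choose

$$k = \frac{2}{\delta_0 - \lambda} > 0 \qquad \text{and} \qquad \varepsilon = \frac{\delta_0}{3k\ell^L} = \frac{\delta_0(\delta_0 - \lambda)}{6\ell^L}.$$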

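This ensures that the hypotheses of Theorem 10 are satisfied. The assumption that $\lambda < \delta_0/(12\ell^L)$ and the
bound on $\varepsilon$ imply that $\varepsilon - \lambda > \varepsilon/2$. Thus, the conclusion of Theorem 10 about the list size reads

$$L' \le \exp_\ell\left(\frac{1}{\varepsilon(\varepsilon - \lambda)}\right) \le \exp_\ell\left(\frac{2}{\varepsilon^2}\right) \le \exp_\ell\left(\frac{72\,\ell^{2L}}{\delta_0^2(\delta_0 - \lambda)^2}\right).$$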

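The definition of $\alpha^*$ from (2) becomes

$$\alpha^* = 1 - \varepsilon\ell^L \left(\frac{1 - \alpha_0}{2 - \alpha_0}\right) = 1 - \frac{\delta_0(\delta_0 - \lambda)}{6} \left(\frac{1 - \alpha_0}{2 - \alpha_0}\right),$$

which implies the claim about $\alpha$. Along with the statement about running time from Theorem 10, this
completes the proof of Theorem 2. □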

$^4$ Because $d$ is constant, we can write $T_d = O(1)$, but it may be that $d$ is large and that there are algorithms for the inner
code that are better than brute force.



5 Conclusion and open questions 

We have shown that expander codes, properly instantiated, are high-rate list recoverable codes with constant 
list size and constant alphabet size, which can be list recovered in linear time. To the best of our knowledge, 
no such construction was known. 

Our work leaves several open questions. Most notably, our algorithm can handle erasures, but it seems 
much more difficult to handle errors. As mentioned above, handling list recovery from errors would open 
the door for many of the applications of list recoverable codes, to list-decoding and other areas. Extending 
our results to errors with linear-time recovery would be most interesting, as it would immediately lead to 
optimal linear-time list-decodable codes. However, even polynomial-time recovery would be interesting: in
addition to giving a new, very different family of efficient locally-decodable codes, this could lead to explicit
(uniformly constructive), efficiently list-decodable codes with constant list size and constant alphabet size, 
which is (to the best of our knowledge) currently an open problem. 

Second, the parameters of our construction could be improved: our choice of inner code (a random linear 
code), and its analysis, is clearly suboptimal. Our construction would have better performance with a better 
inner code. As mentioned in Remark 1, we would need a high-rate linear code which is list recoverable with 
constant list-size (the reason that this is not begging the question is that this inner code need not have a 
fast recovery algorithm). We are not aware of any such constructions. 
A Linear-time unique decoding from erasures

In this appendix, we include (for completeness) the algorithm for uniquely decoding an expander code from
erasures, and a proof that it works. Suppose $C$ is an expander code created from a $d$-regular graph $H$ and
inner code $C_0$ of length $d$, so that the inner code $C_0$ can be corrected from $\delta_0 d$ erasures.


Algorithm 3: A linear time algorithm for erasure recovery

Input: A vector $w \in (\mathbb{F}_q \cup \{\bot\})^{E(H)}$

Output: A codeword $c \in C$

Initialize $B_0 = V(H)$
for $t = 1, 2, \ldots$ do
    if $B_{t-1} = \emptyset$ then
        Break
    end
    $B_t = \emptyset$
    for $v \in B_{t-1}$ do
        if $|\{u : (u, v) \in E(H),\ w_{(u,v)} = \bot\}| < \delta_0 d$ then
            Correct all $(u, v)$, $u \in \Gamma(v)$, using the local erasure recovery algorithm
        end
        else
            $B_t = B_t \cup \{v\}$
        end
    end
end
Return the corrected word $c$.


Lemma 11 (Restatement of Lemma 1). If $C_0$ is a linear code of block length $d$ that can recover from $\delta_0 d$
erasures, and $H$ is a $d$-regular expander with normalized second eigenvalue $\lambda$, then the expander
code $C$ can be recovered from a $\frac{\delta_0}{k}$ fraction of erasures in linear time using Algorithm 3 whenever $\lambda < \delta_0 - \frac{2}{k}$.

Proof. Since there are at most $\frac{\delta_0}{k}|E(H)|$ erasures, at most $\frac{2}{k}|V(H)|$ of the nodes are adjacent to at least $\delta_0 d$
erasures. Thus $|B_1| \le \frac{2}{k}|V(H)|$.

By the expander mixing lemma

$$|E(B_{t-1}, B_t)| \le \frac{d|B_{t-1}||B_t|}{n} + \lambda d\sqrt{|B_{t-1}||B_t|}. \qquad (4)$$

On the other hand, in iteration $t - 1$ of the outer loop, every vertex in $B_t$ has at least $\delta_0 d$ unknown edges, and
these edges must connect to vertices in $B_{t-1}$ (since at step $t - 1$ all vertices in $V(H) \setminus B_{t-1}$ are completely
known).

Thus

$$|E(B_{t-1}, B_t)| \ge \delta_0 d|B_t|. \qquad (5)$$

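Combining equations (4) and (5) we see that

$$\delta_0 d|B_t| \le \frac{d|B_{t-1}||B_t|}{|V(H)|} + \lambda d\sqrt{|B_{t-1}||B_t|}$$

$$\delta_0 \le \frac{|B_{t-1}|}{|V(H)|} + \lambda\sqrt{\frac{|B_{t-1}|}{|B_t|}}$$

$$|B_t| \le \frac{\lambda^2|B_{t-1}|}{\left(\delta_0 - \frac{|B_{t-1}|}{|V(H)|}\right)^2}$$

$$|B_t| \le \left(\frac{\lambda}{\delta_0 - \frac{2}{k}}\right)^2 |B_{t-1}|,$$

where the last line uses the fact that $\frac{2}{k}|V(H)| \ge |B_1| \ge |B_2| \ge \cdots$.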

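Thus at each iteration, $|B_t|$ decreases by a multiplicative factor of $\left(\frac{\lambda}{\delta_0 - 2/k}\right)^2$. This is indeed a decrease as
long as $\lambda < \delta_0 - \frac{2}{k}$. Since $|B_0| = |V(H)|$, after

$$T > \frac{\log|V(H)|}{2\log\left(\frac{\delta_0 - 2/k}{\lambda}\right)}$$

iterations, we have $|B_T| < 1$. Thus the algorithm terminates after at most $T$ iterations of the outer loop.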
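The total number of vertices visited is then

$$\sum_{t=0}^{T} |B_t| \le |V(H)| \cdot \sum_{t=0}^{T} \left(\frac{\lambda}{\delta_0 - \frac{2}{k}}\right)^{2t} \le |V(H)| \cdot \sum_{t=0}^{\infty} \left(\frac{\lambda}{\delta_0 - \frac{2}{k}}\right)^{2t} = |V(H)| \cdot \frac{1}{1 - \left(\frac{\lambda}{\delta_0 - \frac{2}{k}}\right)^2}.$$

Thus the algorithm runs in time $O(|V(H)|)$.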


□ 
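The round structure of Algorithm 3 can be illustrated with a toy Python sketch. It makes a strong simplifying assumption that is not the paper's setting: the inner code is the binary even-parity code on the edges at each vertex, which recovers a single erased edge (so "fewer than $\delta_0 d$ erasures" becomes "fewer than 2"). The graph used in the example is the complete graph $K_4$, chosen only because it is small; the name `peel_erasures` is hypothetical.

```python
def peel_erasures(adj, word):
    """Iteratively fill in erased edges, one round per set B_t.

    adj : dict vertex -> list of incident edges (hashable ids)
    word: dict edge -> 0, 1, or None (None marks an erasure)
    Inner code (assumption): even parity over the edges at every vertex.
    """
    word = dict(word)
    pending = set(adj)                      # B_0 = V(H)
    while pending:
        still_bad = set()
        for v in pending:
            erased = [e for e in adj[v] if word[e] is None]
            if len(erased) < 2:             # fewer than delta_0 * d = 2 erasures
                for e in erased:            # at most one edge to fill in
                    word[e] = sum(word[f] for f in adj[v] if f != e) % 2
            else:
                still_bad.add(v)
        if still_bad == pending:            # no vertex was cleared: give up
            break
        pending = still_bad                 # B_t
    return word

edges = [(a, b) for a in range(4) for b in range(a + 1, 4)]   # complete graph K4
adj = {v: [e for e in edges if v in e] for v in range(4)}
# The 4-cycle 0-1-2-3-0 has even parity at every vertex, so it is a codeword.
codeword = {e: int(e in {(0, 1), (1, 2), (2, 3), (0, 3)}) for e in edges}
received = dict(codeword)
received[(0, 1)] = received[(2, 3)] = None
print(peel_erasures(adj, received) == codeword)
```

The early exit when no vertex is cleared is an addition of this sketch; in the setting of Lemma 11 the expansion argument guarantees that the sets $B_t$ shrink geometrically, so that case does not arise.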


B List recovery capacity theorem 

In this appendix, we prove an analog of the list decoding capacity theorem for list recovery. 

Theorem 12 (List recovery capacity theorem). For every $R > 0$, and $L \ge \ell$, there is some code $C$ of rate
$R$ over $\mathbb{F}_q$ which is $(R + \eta(\ell, L), \ell, L)$-list recoverable, for any

$$\eta(\ell, L) \ge \frac{4\ell}{L} \qquad \text{and} \qquad q \ge \ell^{2/\eta}.$$

Further, for any constants $\eta, R > 0$ and any $\ell$, any code of rate $R$ which is $(R - \eta, \ell, L)$-list recoverable must
have $L = q^{\Omega(n)}$.
Proof. The proof follows that of the classical list-decoding capacity theorem. For the first assertion, consider
a random code $C$ of rate $R$, and set $\alpha = R + \eta$, for $\eta = \eta(\ell, L)$ as in the statement. For any set of $L + 1$
messages $\Lambda \subseteq \mathbb{F}_q^k$, and for any set $T \subseteq [n]$ of at most $\alpha n$ indices $i$, and for any lists $S_i$ of size $\ell$, the probability
that all of the codewords $C(x)$ for $x \in \Lambda$ are covered by the $S_i$ is

$$P\{\forall i \in T, x \in \Lambda,\ C(x)_i \in S_i\} = \left(\frac{\ell}{q}\right)^{|T|(L+1)}.$$

Taking the union bound over all choices of $\Lambda$, $T$, and $S_i$, we see that

$$P\{C \text{ is not } (\alpha, \ell, L)\text{-list recoverable}\} \le \binom{q^{Rn}}{L+1}\binom{n}{\alpha n}\binom{q}{\ell}^{\alpha n}\left(\frac{\ell}{q}\right)^{\alpha n(L+1)}$$

$$\le \exp_q\left(n\left((R - \alpha)(L + 1) + \alpha\left(\ell + L\log(\ell)/\log(q)\right) + H(\alpha)/\log(q)\right)\right)$$

$$\le \exp_q\left(n\left(-L\eta/2 + \alpha\ell\right)\right)$$

$$\le \exp_q\left(-nL\eta/4\right)$$

$$< 1.$$

In particular, there exists a code $C$ of rate $R$ which is $(\alpha, \ell, L)$-list recoverable.

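For the other direction, fix any code $C$ of rate $R$, and choose a random set of $\alpha n$ indices $T \subseteq [n]$, and for
all $i \in T$, choose $S_i \subseteq \mathbb{F}_q$ of size $\ell$ uniformly at random. Now, for any fixed codeword $c \in C$,

$$P\{c_i \in S_i\ \forall i \in T\} = \left(\frac{\ell}{q}\right)^{\alpha n}.$$

Thus,

$$\mathbb{E}\,|\{c \in C : c_i \in S_i\ \forall i \in T\}| = q^{Rn}\left(\frac{\ell}{q}\right)^{\alpha n} \ge q^{(R - \alpha)n}.$$

In particular, if $\alpha = R - \eta$, then this is $q^{\eta n}$.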


□ 
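The chain of inequalities in the first half of the proof can be checked numerically for a sample parameter choice. The sketch below evaluates the per-coordinate exponent (base $q$) of the union bound and compares it with $-L\eta/4$. The parameter values are illustrative assumptions, logarithms are taken base 2 throughout, and the helper names are hypothetical.

```python
import math

def binary_entropy(a):
    """Binary entropy H(a) in bits."""
    return 0.0 if a in (0, 1) else -a * math.log2(a) - (1 - a) * math.log2(1 - a)

def union_bound_exponent(R, eta, ell, L, q):
    """Per-coordinate exponent (base q) of the union bound, with alpha = R + eta."""
    alpha = R + eta
    return ((R - alpha) * (L + 1)
            + alpha * (ell + L * math.log2(ell) / math.log2(q))
            + binary_entropy(alpha) / math.log2(q))

R, ell, L = 0.5, 2, 32
eta = 4 * ell / L            # smallest eta allowed by the theorem: 0.25
q = 2 ** 8                   # equals ell ** (2 / eta)
print(union_bound_exponent(R, eta, ell, L, q), -L * eta / 4)
```

For these values the exponent lies below $-L\eta/4$, so the failure probability decays exponentially in $n$, consistent with the bound in the proof.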